An Efficient Data Fingerprint Query Algorithm Based on Two-Leveled Bloom Filter

Authors

  • Bin Zhou
  • Rongbo Zhu
  • Ying Zhang
  • Linhui Cheng
Abstract

A decade ago, the role of the fingerprint comparison algorithm was simply to judge whether a newly partitioned data chunk was already in the storage system. At present, in most de-duplication backup systems, the fingerprints of the data chunks are too numerous to be held entirely in memory. As a result, fingerprint queries must access the storage system, which unavoidably degrades performance. Accordingly, a new query mechanism, the Two-stage Bloom Filter (TBF), is proposed. First, each bit of the second-grade Bloom filter in the TBF summarizes the first-grade Bloom filter as a whole and represents the chunks that have identical fingerprints, which reduces the false positive rate. Second, a two-dimensional list corresponding to the two grades of Bloom filter is built to hold the absolute addresses of the data chunks with identical fingerprints. Finally, a new class of hash functions with strong global randomness is constructed from the random characteristics of the data fingerprints. By greatly reducing the amount of data to be compared, TBF decreases the number of disk accesses, speeds up the detection of redundant data chunks, and lowers the false positive rate, which improves overall system performance.
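The abstract does not give the construction in detail, so the following is only a minimal sketch of the general idea: two Bloom filters are consulted in sequence, and the fingerprint index is touched only when both report a possible match. Everything named here is an assumption made for illustration: the class names `BloomFilter` and `TwoLevelIndex`, SHA-1 as the fingerprint, the use of disjoint byte slices of the fingerprint as hash values, the filter sizes, and a Python dict standing in for the two-dimensional address list. It is a generic two-filter cascade, not the authors' TBF.

```python
import hashlib


class BloomFilter:
    """Plain Bloom filter whose hash values are slices of the fingerprint."""

    def __init__(self, num_bits, num_hashes, offset=0):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.offset = offset  # byte offset, so the two levels use disjoint slices
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, fingerprint):
        # Assumption: the fingerprint is a uniformly random digest, so
        # disjoint 4-byte slices can serve as independent hash values.
        for i in range(self.num_hashes):
            start = self.offset + 4 * i
            word = fingerprint[start:start + 4]
            yield int.from_bytes(word, "big") % self.num_bits

    def add(self, fingerprint):
        for pos in self._positions(fingerprint):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, fingerprint):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(fingerprint)
        )


class TwoLevelIndex:
    """Hypothetical two-level lookup: both filters must agree before a disk hit."""

    def __init__(self, num_bits=1 << 16):
        # A 20-byte SHA-1 digest has room for the four 4-byte slices used here.
        self.level1 = BloomFilter(num_bits, num_hashes=2, offset=0)
        self.level2 = BloomFilter(num_bits, num_hashes=2, offset=8)
        self.addresses = {}    # stand-in for the on-disk address list
        self.disk_lookups = 0  # counts simulated disk accesses

    def insert(self, chunk, address):
        fp = hashlib.sha1(chunk).digest()
        self.level1.add(fp)
        self.level2.add(fp)
        self.addresses[fp] = address

    def lookup(self, chunk):
        fp = hashlib.sha1(chunk).digest()
        if fp not in self.level1 or fp not in self.level2:
            return None  # definitely a new chunk: no disk access needed
        self.disk_lookups += 1  # possible duplicate: consult the index
        return self.addresses.get(fp)


index = TwoLevelIndex()
index.insert(b"chunk-A", 4096)
print(index.lookup(b"chunk-A"))  # address stored for the duplicate chunk
print(index.lookup(b"chunk-B"))  # None: the chunk was never inserted
```

In this sketch a false positive costs one wasted index lookup but never a wrong answer, because the address table is checked with the full fingerprint. Requiring agreement from a second filter with independent hash positions is one simple way to lower the false positive rate, at the cost of extra memory.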

Similar resources

A Cuckoo Filter Modification Inspired by Bloom Filter

Probabilistic data structures are so popular in membership queries, network applications, and so on. Bloom Filter and Cuckoo Filter are two popular space efficient models that incorporate in set membership checking part of many important protocols. They are compact representation of data that use hash functions to randomize a set of items. Being able to store more elements while keeping a reaso...

Network Forensics on Packet Fingerprints

We present an approach to network forensics that makes it feasible to trace the content of all traffic that passed through the network via packet content fingerprints. We develop a new data structure called the “Rolling Bloom Filter” (RBF), which is based on a generalization of the Rabin-Karp stringmatching algorithm. This merges the two key advantages of space efficiency and an efficient conte...

Cuckoo Filter: Simplification and Analysis

The cuckoo filter data structure of Fan, Andersen, Kaminsky, and Mitzenmacher (CoNEXT 2014) performs the same approximate set operations as a Bloom filter in less memory, with better locality of reference, and adds the ability to delete elements as well as to insert them. However, until now it has lacked theoretical guarantees on its performance. We describe a simplified version of the cuckoo f...

Robust Detection and Tracking of Long-range Target in a Compound Framework Kang Sun and Xinwei Li A Study on the Using Behavior of Depot-Logistic Information System in Taiwan: An Integration of Satisfaction Theory and Technology Acceptance Theory

The function of the comparing fingerprints algorithm was to judge whether a new partitioned data chunk was in a storage system a decade ago. At present, in the most de-duplication backup system the fingerprints of the big data chunks are huge and cannot be stored in the memory completely. The performance of the system is unavoidably retarded by data chunks accessing the storage system at the qu...

The Power of 1 + α for Memory-Efficient Bloom Filters

This paper presents a cache-aware Bloom Filter algorithm with improved cache behavior and a lower false positive rate compared to prior work. The algorithm relies on the power-of-two choice principle to provide a better distribution of set elements in a Blocked Bloom Filter. Instead of choosing a single block, we insert new elements into the least-loaded of two blocks to achieve a low false-po...

Journal:
  • Journal of Multimedia

Volume 8

Publication date: 2013